Building a Japanese-Chinese Dictionary Using Kanji/Hanzi Conversion

Authors

  • Chooi-Ling Goh
  • Masayuki Asahara
  • Yuji Matsumoto
Abstract

A new bilingual dictionary can be built from two existing bilingual dictionaries, for example a Japanese-Chinese dictionary from Japanese-English and English-Chinese dictionaries. However, since Japanese and Chinese are closer to each other than either is to English, there should be a more direct way of doing this. Because many Japanese words are composed of kanji, which are similar to Chinese hanzi, we attempt to build a dictionary for kanji words by simple conversion from kanji to hanzi. Our survey shows that around 2/3 of the nouns and verbal nouns in Japanese are kanji words, and that more than 1/3 of them can be translated into Chinese directly. The accuracy of conversion is 97%. In addition, using English as a pivot language, we obtain translation candidates for 24% of the Japanese words with 77% accuracy. Adding the kanji/hanzi conversion method increases this coverage by 9 percentage points, to 33%, and yields better-quality candidates.
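
The two routes described in the abstract, direct kanji-to-hanzi conversion and translation through an English pivot, can be illustrated with a minimal sketch. This is not the authors' implementation: the character table, the Chinese lexicon check, the toy dictionaries, and the rule of trying conversion before the pivot are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set

# Hypothetical, tiny kanji -> simplified hanzi table (illustrative only).
# A real table would cover thousands of characters.
KANJI_TO_HANZI: Dict[str, str] = {
    "経": "经",
    "済": "济",
    "図": "图",
    "書": "书",
    "館": "馆",
    "学": "学",
    "生": "生",
}


def convert_kanji_word(word: str) -> Optional[str]:
    """Convert a word character by character.

    Returns None if any character (kana, or a kanji missing from the
    table) has no mapping, i.e. the word is not convertible here.
    """
    converted = []
    for ch in word:
        hanzi = KANJI_TO_HANZI.get(ch)
        if hanzi is None:
            return None
        converted.append(hanzi)
    return "".join(converted)


def pivot_candidates(
    word: str,
    ja_en: Dict[str, List[str]],
    en_zh: Dict[str, List[str]],
) -> List[str]:
    """Collect Chinese candidates by chaining Japanese->English->Chinese."""
    candidates: List[str] = []
    for english in ja_en.get(word, []):
        for chinese in en_zh.get(english, []):
            if chinese not in candidates:  # keep order, drop duplicates
                candidates.append(chinese)
    return candidates


def translate(
    word: str,
    zh_lexicon: Set[str],
    ja_en: Dict[str, List[str]],
    en_zh: Dict[str, List[str]],
) -> List[str]:
    """Try direct conversion first; fall back to the English pivot.

    Assumption: a converted string counts as a translation only if it
    is an attested word in the (hypothetical) Chinese lexicon.
    """
    converted = convert_kanji_word(word)
    if converted is not None and converted in zh_lexicon:
        return [converted]
    return pivot_candidates(word, ja_en, en_zh)
```

For example, with a lexicon containing 图书馆, `translate("図書館", {"图书馆"}, {}, {})` succeeds by conversion alone, while a kana word such as りんご can only be reached through the pivot dictionaries.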

Similar resources

Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese

Chinese characters are used in both Japanese and Chinese, where they are called Kanji and Hanzi respectively. Because Chinese characters carry significant semantic information, a mapping table between Kanji and Hanzi can be very useful for many Japanese-Chinese bilingual applications, such as machine translation and cross-lingual information retrieval. Because Kanji characters are originated from ancient ...

Multilingual Conceptual Access to Lexicon based on Shared Orthography: An ontology-driven study of Chinese and Japanese

In this paper we propose a model for conceptual access to multilingual lexicon based on shared orthography. Our proposal relies crucially on two facts: That both Chinese and Japanese conventionally use Chinese orthography in their respective writing systems, and that the Chinese orthography is anchored on a system of radical parts which encodes basic concepts. Each orthographic unit, called han...

Master Thesis Analysis of the Effects of Japanese-Chinese Machine Translation with Kanji/Simplified Chinese Conversion

Currently, most Japanese-Chinese machine translation uses English as an intermediary language. Because there are not enough Japanese-Chinese bilingual dictionaries, Japanese-Chinese machine translation is less precise than Japanese-English or Chinese-English machine translation. In order to make translations smooth and adequate, it is necessary and efficient for native people to modify t...

upTeX: Unicode Version of pTeX with CJK Extensions

upTeX is a Unicode extension of ASCII’s pTeX (a Japanese-localized TeX). It not only improves Japanese support, but also handles Chinese and Korean characters, i.e., Kanji (Hanzi, Hanja), Kana, CJK symbols, and Hangul with Unicode. Moreover, it can process multilingual typesetting of original LaTeX with inputenc and Babel (Latin, Cyrillic, Greek, etc.) by switching its \kcatcode tables. This pap...

Building A Graphetic Dictionary For Japanese Kanji - Character Look-Up Based On Brush Strokes Or Stroke Groups, And The Display Of Kanji As Path Data

Reading and writing Japanese isn’t easy for Japanese and foreigners alike. While Japanese learn these skills at school, foreigners should be helped by good teaching material and dictionaries. Kanji lexica have to be very different from other dictionaries. Unfortunately existing lexica normally expect that the users already have a lot of information o...

Publication date: 2005